Methodological Guidelines and Adaptive Statistical Data Validation to Build Effective Data Warehouses
Authors
Abstract
Data integration involving data warehouses is becoming more difficult to develop and manage as data sources grow more heterogeneous. Despite significant advances in research and technology, many integration projects are still too slow to produce practical results and are often abandoned before they do. The objective of this work is to specify a development strategy for building effective data warehouse integration systems whose data quality increases progressively. To achieve this goal, we propose methodological guidelines for constructing evolutionary integration systems, together with a framework for automated statistical validation that facilitates improving the system's data quality.
Similar resources
Long-term Streamflow Forecasting by Adaptive Neuro-Fuzzy Inference System Using K-fold Cross-validation: (Case Study: Taleghan Basin, Iran)
Streamflow forecasting plays an important role in water resource management (e.g. flood control, drought management, reservoir design, etc.). In this paper, the Adaptive Neuro Fuzzy Inference System (ANFIS) is applied to long-term streamflow forecasting (monthly, seasonal), and the K-fold cross-validation method is investigated to evaluate test-training data in the model. Then,...
Ensemble strategies to build neural network to facilitate decision making
There are three major strategies to form neural network ensembles. The simplest one is the Cross Validation strategy, in which all members are trained with the same training data. Bagging and boosting strategies produce perturbed samples from the training data. This paper provides an ideal model based on two important factors: activation function and number of neurons in the hidden layer and based u...
Identification and validation characteristics of effective teacher primary Period
The main objective of this research was to identify and validate the characteristics of the effective teacher in the elementary period. The first part of the Q-method was used. The statistical population in this section was all faculty members of educational sciences at Tehran universities (N = 349), of whom 20 were selected by purposive and snowball sampling. The Data collection too...
An Adaptive Approach to Increase Accuracy of Forward Algorithm for Solving Evaluation Problems on Unstable Statistical Data Set
Hidden Markov models are now extensively used for modeling stochastic processes. These models help researchers establish and implement the desired theoretical foundations using Markov algorithms such as the Forward algorithm. However, using the stability hypothesis and the mean statistic to determine the values of Markov functions on unstable statistical data sets has led to a significant reducti...
Conceptual Design of XML Document Warehouses
EXtensible Markup Language (XML) has emerged as the dominant standard for describing and exchanging data among heterogeneous data sources. XML, with its self-describing hierarchical structure and its associated XML Schema (XSD), provides the flexibility and the manipulative power needed to accommodate complex, disconnected, heterogeneous data. The issue of large volume of data appearing deserves i...